New variable selection methods for zero-inflated count data with applications to the substance abuse field.
Abstract
Zero-inflated count data are very common in health surveys. This study develops new variable selection methods for the zero-inflated Poisson regression model. Our simulations demonstrate the negative consequences of ignoring zero-inflation. Among the competing methods, the one-step SCAD method is recommended because it has the highest specificity, sensitivity, and exact fit, and the lowest estimation error. The design of the simulations is based on the special features of two large national databases commonly used in the alcoholism and substance abuse field, so that our findings can be easily generalized to real settings. Applications of the methodology are demonstrated by empirical analyses of data from a well-known alcohol study.
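As a rough illustration of the two ingredients the abstract names, the sketch below writes out the standard zero-inflated Poisson probability mass function and the SCAD penalty in its usual textbook form (Fan and Li, 2001, with the conventional default a = 3.7). The function names and the parameter values are hypothetical choices for illustration only. This is not the paper's one-step estimator, which is not described in the abstract.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson: a point mass at zero with
    probability pi, mixed with a Poisson(lam) count with probability 1 - pi."""
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    return pi * (k == 0) + (1.0 - pi) * poisson


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient (textbook form, a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Behaves like the lasso penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that tapers the penalty off.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


# Illustrative values only: count mean 2, 30% structural zeros.
p_zero_zip = zip_pmf(0, 2.0, 0.3)
p_zero_poisson = zip_pmf(0, 2.0, 0.0)  # plain Poisson as the pi = 0 case
```

With these illustrative values, `p_zero_zip` exceeds `p_zero_poisson`, which is the excess-zero feature that a plain Poisson model cannot capture. The penalty is flat beyond `a * lam`, which is why SCAD leaves large coefficients nearly unpenalized while still shrinking small ones toward zero.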
Similar resources
Hurdle, Inflated Poisson and Inflated Negative Binomial Regression Models for Analysis of Count Data with Extra Zeros
In this paper, we propose hurdle regression models for analysing count responses with extra zeros. Maximum likelihood is used to estimate the model parameters. The application of the proposed model is illustrated on an insurance dataset. In this example, many of the claim counts are equal to zero, which clarifies the application of the model with a zero-inflat...
Bayesian Zero-Inflated Poisson model for prognosis of demographic factors associated with using crystal meth in Tehran population
Background: Use of methamphetamine (MA) and other stimulants has increased steadily over the past 10 years. Evaluating risk factors in order to reduce the problem in the community is one way to protect people from addiction. This study aimed to use a Bayesian zero-inflated Poisson (ZIP) model to investigate the relationship between the number of times crystal meth was used and some demogr...
Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany.
In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider maximum likelihood function plus a...
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
Comparison of an artificial neural network model with count data regression models in predicting the number of blood donations
Background: Modeling is one of the most important ways of explaining the relationship between dependent and independent variables. Since data on the number of blood donations are discrete, it is better to describe them with a discrete distribution such as the Poisson or negative binomial. This research tries to analyze numerical methods by using neural network approach and compare ...
Journal title:
- Statistics in Medicine
Volume 30, Issue 18
Pages -
Publication date 2011